Speed up of compare_values and has_value methods#101
mielvds merged 3 commits into digitalbazaar:linter-jsonld from
Thanks, I'll take a look when I get a chance. I may need to finish getting the test suite up to date so we can check whether the changes are OK. Can you give a brief example of the form of data that was slow? Just one or two properties is fine, not 10000+ :-) I've started setting up a benchmarking system, and issues like this are useful targets for auto-generating test data inputs.
I'd like to bump this as it fixes major issues we're facing internally, with slow framing on large JSON-LD files. I will try to generate some artificial data (large amount of data, few properties).
Hi! Any plans to include this? We are also facing issues with framing large JSON-LD files.
7882a62 to 4a8728b
@RinkeHoekstra I know it's been a while, but now that there are spec tests, this performance fix makes a Flattening and Transform test fail: https://github.com/digitalbazaar/pyld/actions/runs/21819124584/job/62947739897?pr=101
4a8728b to e5a419a
e5a419a to 69e59f3
I have fixed the bugs in the performance version; the test suite passes now. Not all performance gains have been kept, but there is still a significant speedup.
7f93c71 to 181e4de
@dvsrepo if, after 5 years, you're still in a position to benchmark this, please do :)
I ran a quick and dirty benchmark on the transformation tests:
I noticed very slow performance in the `to_rdf` procedure for a JSON-LD file with several tens of thousands of typed object values for a single property. Profiling with cProfile showed that an inordinate amount of time was spent in the methods `compare_values` and `has_value`.

This pull request introduces the following changes:
- `compare_values` now uses exception handling rather than if-then statements. I also changed the ordering and removed the boolean comparison between primitive values (it may need to return, but I couldn't understand the reason behind it).
- The `has_value` method (which called `compare_values` very frequently) now performs its checks only once and only compares values of the same type.

There's also a question about the way `has_value` is implemented: it seems that if the `value` parameter that's passed is an array, it is completely ignored. Is that the correct behavior?
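
For readers following along, here is a minimal sketch of the exception-handling style described above. It is not the code from this pull request: the names `compare_values_sketch` and `has_value_sketch` are hypothetical, and the comparison rules are an assumption based on the description (node references compared by `@id`, value objects by `@value`, `@type`, `@language` and `@index`, primitives by equality). It also shows one plausible reason for a separate boolean check between primitives, namely that `True == 1` holds in Python; the thread itself does not confirm that this was the original motivation.

```python
def compare_values_sketch(v1, v2):
    """Hypothetical EAFP-style comparison of two JSON-LD values (not PyLD's code)."""
    # Node references: attempt the '@id' lookup and let failures fall through.
    try:
        return v1['@id'] == v2['@id']
    except (KeyError, TypeError):
        pass
    # Value objects: '@value' must be present on both sides; the other keys may be absent.
    try:
        return (v1['@value'] == v2['@value']
                and all(v1.get(k) == v2.get(k)
                        for k in ('@type', '@language', '@index')))
    except (KeyError, TypeError):
        pass
    # A dict that is neither a node reference nor a value object never matches.
    if isinstance(v1, dict) or isinstance(v2, dict):
        return False
    # Primitives: keep booleans apart from numbers, since True == 1 in Python.
    if isinstance(v1, bool) or isinstance(v2, bool):
        return type(v1) is type(v2) and v1 == v2
    return v1 == v2


def has_value_sketch(subject, prop, value):
    """Hypothetical membership test: does subject[prop] contain value?"""
    try:
        val = subject[prop]
    except KeyError:
        return False
    # Unwrap a list object once, before any comparison.
    if isinstance(val, dict) and '@list' in val:
        val = val['@list']
    if isinstance(val, list):
        return any(compare_values_sketch(value, v) for v in val)
    # Assumption: an array passed as `value` never matches a single stored value.
    if isinstance(value, list):
        return False
    return compare_values_sketch(value, val)
```

With this sketch, `compare_values_sketch(True, 1)` is `False` while `compare_values_sketch(1, 1.0)` is `True`, and passing a list as `value` against a single stored value returns `False` rather than raising, which mirrors the behavior questioned above.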